Abstract

Partial Labeled Markov Chains are simultaneously generalizations of process algebra and of traditional 
Markov chains. They provide a foundation for interacting discrete probabilistic systems, the interaction 
being synchronization on labels as in process algebra. Existing notions of process equivalence are too 
sensitive to the exact probabilities of various transitions. This paper develops contextual reasoning
principles for more robust notions of “approximate” equivalence between concurrent interacting
probabilistic systems.

• We develop a family of metrics between partial labeled Markov chains to formalize the notion of 
distance between processes. 

• We show that processes at distance zero are bisimilar. 

• We describe a decision procedure to compute the distance between two processes. 

• We show that reasoning about approximate equivalence can be done compositionally by showing 
that process combinators do not increase distance. 

• We introduce an asymptotic metric to capture asymptotic properties of Markov chains; and show 
that parallel composition does not increase asymptotic distance. 
1 Introduction


Probability, like nondeterminism, is an abstraction mechanism used to hide inessential or unknown details. 
Statistical mechanics — originated by Boltzmann, Gibbs, Maxwell and others — is the fundamental success- 
ful example of the use of the probabilistic abstraction. Computer science, and process algebraic theories in 
particular, are focussed on providing compositional reasoning techniques. Our investigations are concerned 
with the development of contextual reasoning principles for concurrent interacting probabilistic systems. Con- 
sider the following paradigmatic examples. 

Example 1.1 [AJKvO97] analyzes a component (say c) of the Lucent Technologies 5ESS telephone switching
system that is responsible for detecting malfunctions on the hardware connections between switches¹. This
component responds to alarms generated by another complicated system that is only available as a black-box.
A natural model to consider for the black-box is a stochastic one, which represents the timing and duration of
the alarm by random variables with a given probability distribution. [AJKvO97] then shows that the desired
properties hold with extremely high probability, showing that the component being analyzed approximates the
desired idealized behavior (say i) with sufficient accuracy.

Example 1.2 Consider model-based diagnosis settings. Often information about failure models and their
associated probabilities is obtained from field studies and studies of manufacturing practices. Failure models
can be incorporated by assigning a variable, called the mode of the component, to represent the physical state
of the component, and associating a failure model with each value of the mode variable. Probabilistic information
can be incorporated by letting the mode vary according to the given probability distribution [dKW89].
The diagnostic engine computes the most probable diagnostic hypothesis, given observations about the current
state of the system.

These examples illustrate the modes of contextual reasoning that interest us. In the first example, we are 
interested in exploring whether c can substitute for i in arbitrary program contexts; i.e. for some context C[],
does C[c] continue to approximate C[i]? Similarly, in the second example, we are looking to see the extent
to which systems with similar failure behaviors are intersubstitutable. Such a question perforce generalizes 
the study of congruences elaborated by the theory of concurrency. The theory of concurrency performs a 
study of “exactly intersubstitutable” processes with temporal behavior. In the probabilistic context, the extant 
notions of bisimulation (or any process equivalence for that matter) are too sensitive to the probabilities; a 
slight perturbation of the probabilities would make two systems non-bisimilar. The examples motivate a shift 
to the study of the more robust notion of “approximately intersubstitutable”. 

The next example illustrates a deeper interaction of the temporal and probabilistic behavior of processes. 

Example 1.3 Consider a producer and a consumer process connected by a buffer, where the producer is, say,
a model of a network. Examples of this kind are studied extensively in the performance modeling of systems.
In a model of such a system, probability serves to abstract the details of the producer (resp. consumer)
process by considering rates of production (resp. consumption) of data based on empirical information. This
model can be analyzed to calculate the number of packets lost as a function of the probabilities and the buffer
size. The analysis aids in tuning system parameters, e.g. to optimize the buffer size. These studies are often
couched in terms of asymptotic/stationary behavior to abstract over the transient behavior associated with
system initialization (such as large bursts of communication) evident when the system begins execution.

Such examples motivate the study of equality notions based on “eventually approximately intersubstitutable” 
processes. 

¹ For another instance of modeling a complex environment that is best done statistically, see [Gat95].
1.1 Our results

Partial labeled Markov chains (plMc) are the discrete probabilistic analogs of labeled transition systems. In 
this model “internal choice” is modeled probabilistically and the so-called “external choice” is modeled by
the indeterminate actions of the environment. The starting point of our investigation is the study of strong 
bisimulation for plMc. This study was initiated by [LS91] for plMc in a style similar to the queuing theory 
notion of “lumpability”. This theory has been extended to continuous state spaces and continuous distribu- 
tions [BDEP97, DEP98]. These papers showed: 

• Bisimulation is an equivalence relation. 

• The logic $\mathcal{L}$ given by $\phi ::= T \mid \phi_1 \wedge \phi_2 \mid \langle a\rangle_q \phi$ is complete for bisimulation².

In the context of the earlier discussion, we note that probabilistic bisimulation is too exact for our 
purposes — intuitively, two states are bisimilar only if the probabilities of outgoing transitions match ex- 
actly, motivating the search for a relaxation of the notion of equivalence of probabilistic processes. Jou and 
Smolka [JS90] note that the idea of saying that processes that are close should have probabilities that are 
close does not yield a transitive relation, as illustrated by an example of van Breugel [Bre]. This leads them 
to propose that the correct formulation of the “nearness” notion is via a metric. 

A metric $d$ is a function that yields a real number distance for each pair of processes. It should satisfy
the usual metric conditions: $d(P, Q) = 0$ implies $P$ is bisimilar to $Q$, $d(P, Q) = d(Q, P)$ and
$d(P, R) \le d(P, Q) + d(Q, R)$. Inspired by the Hutchinson metric on probability measures [Hut81], we demand that $d$
be “Lipschitz” with respect to probability numbers, an idea best conveyed via a concrete example.

Example 1.4 Consider the family of plMcs $\{P_\epsilon \mid 0 \le \epsilon < r\}$ where $P_\epsilon = a_{r-\epsilon}.Q$, i.e. $P_\epsilon$ is the plMc that
does an $a$ with probability $r - \epsilon$ and then behaves like $Q$. We demand that:

$d(P_{\epsilon_1}, P_{\epsilon_2}) \le |\epsilon_1 - \epsilon_2|.$

This implies that $P_\epsilon$ converges to $P_0$ as $\epsilon$ tends to 0.

Metrics on plMcs. Our technical development of these intuitions is based on the key idea expounded by 
Kozen [Koz85] to generalize logic to handle probabilistic phenomena. 


Classical logic                  Generalization
-------------------------------  -------------------
Truth values {0, 1}              Interval [0, 1]
Propositional function           Measurable function
State                            Measure
Evaluation of prop. functions    Integration


Following these intuitions, we consider a class $\mathcal{F}$ of functions that assign a value in the interval [0, 1] to states
of a plMc. These functions are inspired by the formulas of $\mathcal{L}$: the result of evaluating these functions at
a state corresponds to a quantitative measure of the extent to which the state satisfies a formula of $\mathcal{L}$. The
identification of this class of functions is a key contribution of this paper, and motivates a metric $d$:

$d(P, Q) = \sup\{\, |f(s_P) - f(s_Q)| \mid f \in \mathcal{F} \,\}.$

In section 4, we formalize the above intuitions to define a family of metrics $\{d^c \mid c \in (0, 1]\}$. These
metrics support the spectrum of possibilities of relative weighting of the two factors that contribute to the

² $a$ is a label, $q$ is a rational. $\langle a\rangle_q \phi$ holds in a state $s$ if $s$ has probability $> q$ of making an $a$-transition to the set of states satisfying
$\phi$. Note that such a characterization of bisimulation using a negation-free logic is a new result even for discrete systems.
distance between processes: the complexity of the functions distinguishing them versus the amount by which
each function distinguishes them. $d^1$ captures only the differences in the probability numbers; probability
differences at the first transition are treated on par with probability differences that arise very deep in the
evolution of the process. In contrast, $d^c$ for $c < 1$ gives more weight to the probability differences that arise
earlier in the evolution of the process, i.e. differences identified by simpler functions. As $c$ approaches 0, the
future gets discounted more.

As is usual with metrics, the actual numerical values of the metric are less important than the notions of 
convergence that they engender³. Our justification of the metrics will rely on properties like the significance
of zero distance, relative distance of processes, contractivity and the notion of convergence rather than a 
detailed justification of the exact numerical values. 

Example 1.5 Consider the plMc $P$ with two states, and a transition going from the start state to the other
state with probability $p$. Let $Q$ be a similar process, with the probability $q$. Then in section 4, we show that
$d^c(P, Q) = c|p - q|$. Now if we consider $P'$ with a new start state, which makes a $b$ transition to $P$ with
probability 1, and similarly $Q'$ whose start state transitions to $Q$ on $b$ with probability 1, then $d^c(P', Q') =
c^2|p - q|$, showing that the next step is discounted by $c$.
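
The discounting in Example 1.5 can be checked numerically. The following is a minimal sketch, assuming that the distances are witnessed by the single-prefix expressions described in the example; it does not compute the supremum over all functional expressions, and the helper names are hypothetical.

```python
def prefix_value(c, prob, inner):
    """One discounted prefix step: c * (transition probability) * (value after the step)."""
    return c * prob * inner


def example_distances(c, p, q):
    """Gaps produced by the witnessing prefix expressions of Example 1.5.

    Assumption: these prefix expressions realise the distance, as the example states.
    """
    # P and Q: one step with probability p (resp. q), then the constant 1.
    d_pq = abs(prefix_value(c, p, 1.0) - prefix_value(c, q, 1.0))
    # P' and Q': a b-step with probability 1 first, then behave like P (resp. Q).
    inner_p = prefix_value(c, p, 1.0)
    inner_q = prefix_value(c, q, 1.0)
    d_prime = abs(prefix_value(c, 1.0, inner_p) - prefix_value(c, 1.0, inner_q))
    return d_pq, d_prime


if __name__ == "__main__":
    print(example_distances(0.5, 0.8, 0.6))
```

With $c = 0.5$, $p = 0.8$, $q = 0.6$ the two gaps are $c|p - q|$ and $c^2|p - q|$ up to floating-point rounding.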

Each of these metrics agrees with bisimulation:

$d^c(P, Q) = 0$ iff $P$ and $Q$ are bisimilar.

For $c < 1$, we show how to evaluate $d^c(P, Q)$ to within an $\epsilon$-error for finite state processes $P$, $Q$.

An “asymptotic” metric on plMc. The $d^c$ metric (for $c < 1$) is heavily influenced by the initial transitions
of a process: processes which can be differentiated early are far apart. For each $c \in (0, 1]$, we define a dual
metric $d^c_\infty$ (Section 6) on plMcs to capture the idea that processes are close if they have the same behavior
“eventually”, thus disregarding their initial behavior. Informally, we proceed as follows. Let $P$ after $s$ stand
for the plMc $P$ after exhibiting a trace $s$. Then, the $j$'th distance $d^c_j$ between $P$, $Q$ after exhibiting traces of
length $j$ is given by

$\sup\{\, d^c(P \text{ after } s,\ Q \text{ after } s) \mid \mathrm{length}(s) = j \,\}.$

The asymptotic distance between $P$, $Q$ is given by the appropriate limit of the $d^c_j$'s:

$d^c_\infty(P, Q) = \lim_{i \to \infty} \sup_{j > i} d^c_j(P, Q).$


A process algebra of probabilistically determinate processes. In order to illustrate the properties of the 
metrics via concrete examples, we use an algebra of probabilistically determinate processes and a (bounded) 
buffer example coded in the algebra (Section 5). This process algebra has input and output prefixing, parallel 
composition and a probabilistic choice combinator. We do not consider hiding since this paper focuses on 
strong (as opposed to weak) probabilistic bisimulation. 

We show that bisimulation is a congruence for all these operations. Furthermore, we generalize the result 
that bisimulation is a congruence, by showing that process combinators do not increase distance in any of the 
$d^c$ metrics. Formally, let $d^c(P_i, Q_i) = \epsilon_i$. For every $n$-ary process combinator $C[X_1, \ldots, X_n]$, we have

$d^c(C[P_1, \ldots, P_n],\ C[Q_1, \ldots, Q_n]) \le \sum_i \epsilon_i.$


³ We take the uniformity view of metrics, e.g. see [Bou89]. Intuitively, a uniformity captures relative distances, e.g. if $x$ is closer
to $z$ than $y$; it ignores the numerical distances. For example, a uniformity on a metric space $M$ is induced by the collection of sets
$K_\epsilon = \{(x, y) \in M \times M \mid d(x, y) < \epsilon\}$; note that different metrics may yield the same uniformity.


We show that the prefixing and parallel composition combinators do not increase the asymptotic distance
$d^c_\infty$. However, the probabilistic choice combinator is not contractive for $d^c_\infty$.

Continuous systems. While this paper focuses on systems with a countable number of states, all the results 
extend to systems with continuous state spaces. The technical development of continuous systems requires 
measure theory apparatus to develop analogs of the results in section 3⁴ and will be reported in a separate
paper. 

Related and future work. In this paper, we deal with probabilistic nondeterminism. In a probabilistic 
analysis, quantitative information is recorded and used in the reasoning. In contrast, a purely qualitative 
nondeterministic analysis does not require and does not yield quantitative information. In particular when 
one has no quantitative information at all, one has to work with indeterminacy — using a uniform probability 
distribution is not the same as expressing complete ignorance about the possible outcomes. 

The study of the interaction of probability and nondeterminism, largely in the context of exact equiv- 
alence of probabilistic processes, has been explored extensively in the context of different models of con- 
currency. Probabilistic process algebras add a notion of randomness to the process algebra model and have 
been studied extensively in the traditional framework of (different) semantic theories of (different) process 
algebras (to name but a few, see [HJ90, JY95, LS91, HS86, BBS95, vGSS95, CSZ92]), e.g. bisimulation,
theories of (probabilistic) testing, relationship with (probabilistic) modal logics etc. Probabilistic Petri
nets [Mar89, VN92] add Markov chains to the underlying Petri net model. This area has a well developed
suite of algorithms for performance evaluation. Probabilistic studies have also been carried out in the context
of I/O Automata [Seg95, WSS97].

In contrast to the above body of research, the primary theme of this paper is the study of intersubstitutivity
of (eventually) (approximately) equivalent processes. The ideas of approximate substitutivity in this
paper are inspired by the work of Jou and Smolka [JS90] referred to earlier and the ideas in the area of performance
modeling as exemplified in the work on process algebras for compositional performance modeling
(see for example [Hil94]). The extension of the methods of this paper to systems which have both probability 
and traditional nondeterminism remains open and will be the object of future study. 

The verification community has been active in developing model checking tools for probabilistic systems, 
for example [BLL+96, BdA95, BCHG+97, CY95, HK97]. Approximation techniques in the spirit of those of 
this paper have been explored for hybrid systems [GHJ97]. In future work, we will explore efficient algorithms 
and complexity results for our metrics. 

Our work on the asymptotic metric is closely related, at least in spirit, to the work of Lincoln, Mitchell,
Mitchell and Scedrov [LMMS98] in the context of security protocols. Both [LMMS98] and this paper
consider the asymptotic behavior of a single process, rather than the limiting behavior of a probabilistically
described family of processes, as is done in some analyses in Markov theory.

Organization of this paper. The rest of this paper is organized as follows. First, in section 2, we review
the notions of plMc and probabilistic bisimulation and associated results to make the paper self-contained. 
We next present (section 3) an alternate way to study processes using real-valued functions and show that 
this view presents an alternate characterization of probabilistic bisimulation. In section 4, we define a family 
of metrics, illustrate with various examples and describe a decision procedure to evaluate the metric.
Section 5 describes a process algebra of probabilistically determinate processes. We conclude with
section 6 on the asymptotic metric.

⁴ In particular the results on finite detectability of logical satisfaction.
2 Background

This section on background briefly recalls definitions from previous work [BDEP97, DEP98, LS91] on partial
labeled Markov processes and sets up the basic notations and framework for the rest of the paper. Our 
definitions are for discrete spaces, see [BDEP97] for the continuous space definitions. 

Definition 2.1 A partial labeled Markov chain (plMc) with a label set $L$ is a structure $(S, \{k_l \mid l \in L\}, s)$,
where $S$ is a countable set of states, $s$ is the start state, and $\forall l \in L.\ k_l : S \times S \to [0, 1]$ is a transition function
such that $\forall s \in S.\ \sum_t k_l(s, t) \le 1$.

A plMc is finite if S is finite. 

There is no finite branching restriction on a plMc; $k_l(s, t)$ can be non-zero for countably many $t$'s. $k_l$ is
extended to a function $S \times \mathcal{P}(S) \to [0, 1]$ by defining $k_l(s, A) = \sum_{t \in A} k_l(s, t)$. Given a plMc
$P = (S, \{k_l \mid l \in L\}, s)$, we shall refer to its state set, transition probability and initial state as $S_P$, $k^P_l$ and $s_P$
respectively, when necessary.

We could have alternatively presented a plMc as a structure $(S, \{k_l \mid l \in L\}, \mu)$ where $\mu$ is an initial
distribution on $S$. This notion of initial distribution is no more general than the notion of initial state. Given
a plMc $P$ with initial distribution $\mu^P$, one can construct an equivalent plMc $Q$ with an initial state as follows.
$S_Q = S_P \cup \{u\}$ where $u$ is a new state not in $S_P$; $u$ will be the start state of $Q$. $k^Q_l(s, t) = k^P_l(s, t)$ if
$s, t \in S_P$; $k^Q_l(s, u) = 0$, and $k^Q_l(u, t) = \sum_{s \in S_P} k^P_l(s, t)\, \mu^P(s)$. We will freely move between the notions of
initial state and initial distribution. For example, when a transition on label $l$ occurs in a plMc $P$, there is a
new initial distribution given by $\mu'(t) = \sum_s k_l(s, t) \times \mu(s)$.
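
The construction of an initial state from an initial distribution can be sketched in code. This is an illustrative sketch only: the dictionary encoding of a plMc (label to a map from state pairs to probabilities) and the function name `add_start_state` are assumptions made for the example, not notation from the paper.

```python
def add_start_state(k, mu, u="u"):
    """Replace an initial distribution by a fresh start state.

    k  : dict mapping each label to a dict {(s, t): probability}   (assumed encoding)
    mu : dict mapping each state to its initial probability
    u  : name of the new start state; must not already occur in the plMc
    Returns (k_Q, u), where k_Q is the transition structure of the new plMc Q.
    """
    states = set(mu)
    for trans in k.values():
        for (s, t) in trans:
            states.update((s, t))
    if u in states:
        raise ValueError("the new start state must not be an existing state")

    k_q = {}
    for label, trans in k.items():
        new_trans = dict(trans)  # k^Q agrees with k^P on pairs of old states
        # k^Q(u, t) = sum over s of k^P(s, t) * mu(s); k^Q(s, u) = 0 is left implicit.
        for (s, t), prob in trans.items():
            weight = prob * mu.get(s, 0.0)
            if weight:
                new_trans[(u, t)] = new_trans.get((u, t), 0.0) + weight
        k_q[label] = new_trans
    return k_q, u


if __name__ == "__main__":
    k = {"a": {("s", "t"): 0.5, ("r", "t"): 1.0}}
    mu = {"s": 0.5, "r": 0.5}
    print(add_start_state(k, mu))
```

Transitions into the new state are simply absent from the dictionary, which encodes $k^Q_l(s, u) = 0$.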

We recall the definition of bisimulation on plMc from [LS91]. 

Definition 2.2 An equivalence relation, $R$, on the set of states of a plMc $P$ is a bisimulation if whenever
two states $s_1$ and $s_2$ are $R$-related, then for any label $l$ and any $R$-equivalence class of states $T$,
$k_l(s_1, T) = k_l(s_2, T)$.

Two plMcs $P$, $Q$ are bisimilar if there is a bisimulation $R$ on the disjoint union of $P$, $Q$ such that $s_P \mathrel{R} s_Q$.
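
For a finite plMc, Definition 2.2 can be checked directly once the equivalence relation is given as a partition. The sketch below assumes a dictionary encoding of transitions and a hypothetical function name `is_bisimulation`; it checks a candidate relation and does not search for the largest bisimulation.

```python
def is_bisimulation(k, partition, tol=1e-12):
    """Check Definition 2.2 for an equivalence relation given as a partition.

    k         : dict mapping each label to a dict {(s, t): probability}  (assumed encoding)
    partition : list of sets of states, the R-equivalence classes
    tol       : tolerance used when comparing floating-point probabilities
    """
    def mass(label, s, block):
        # k_l(s, T): total probability of moving from s into the class T on label l
        return sum(p for (x, t), p in k[label].items() if x == s and t in block)

    for label in k:
        for cls in partition:              # states in cls are pairwise R-related
            members = list(cls)
            for target in partition:       # every R-equivalence class T
                reference = mass(label, members[0], target)
                for s in members[1:]:
                    if abs(mass(label, s, target) - reference) > tol:
                        return False
    return True


if __name__ == "__main__":
    k = {"a": {("s1", "x"): 0.3, ("s1", "y"): 0.2, ("s2", "x"): 0.5}}
    print(is_bisimulation(k, [{"s1", "s2"}, {"x", "y"}]))    # x and y lumped together
    print(is_bisimulation(k, [{"s1", "s2"}, {"x"}, {"y"}]))  # x and y kept apart
```

Lumping `x` and `y` into one class makes `s1` and `s2` agree on every class; separating them does not, which mirrors the “lumpability” reading mentioned in section 1.1.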

In [DEP98] it is shown that bisimulation can be characterized using a negation-free logic $\mathcal{L}$: $T \mid \phi_1 \wedge \phi_2 \mid \langle a\rangle_q \phi$,
where $a$ is a label from the set of labels $L$ and $q \in [0, 1)$ is a rational number. Given a plMc
$P = (S, \{k_a \mid a \in L\}, s)$ we write $t \models_P \phi$ to mean that the state $t$ satisfies the formula $\phi$. The definition of the
relation $\models$ is given by induction on formulas.

$t \models_P T$

$t \models_P \phi_1 \wedge \phi_2 \iff t \models_P \phi_1 \text{ and } t \models_P \phi_2$

$t \models_P \langle a\rangle_q \phi \iff \exists A \subseteq S.\ (\forall t' \in A.\ t' \models_P \phi) \wedge (q < k_a(t, A)).$

In words, $t \models_P \langle a\rangle_q \phi$ if the system $P$ in state $t$ can make an $a$-move to a set of states that satisfy $\phi$ with
probability strictly greater than $q$. We write $[\![\phi]\!]_P$ for the set $\{s \in S_P \mid s \models \phi\}$. We often omit the $P$ subscript
when no confusion can arise. The results of [DEP98] relevant to the current paper are:

• Two plMcs are bisimilar if and only if their start states satisfy the same formulas.

• [DEP98] also shows how to construct the maximal autobisimulation on a given system. In the finite 
state case, this yields a state minimization construction. 
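
The satisfaction relation recalled above is straightforward to evaluate on a finite plMc, since the largest candidate witness set for a modal formula is the set of all states satisfying its body. The following sketch assumes a tuple encoding of formulas and a dictionary encoding of transitions (both hypothetical), and encodes the two processes discussed in Example 2.3 under the assumption that the left process has a deadlocked $a$-branch of probability $p = 0.3$ and a $b$-capable $a$-branch of probability $q = 0.4$.

```python
def sat_set(k, states, phi):
    """Return the set of states in `states` that satisfy phi.

    Formulas (assumed encoding):
      ("T",)              the constant true formula
      ("and", f1, f2)     conjunction
      ("dia", a, q, f)    <a>_q f : an a-move into states satisfying f
                          with probability strictly greater than q
    k : dict mapping each label to a dict {(s, t): probability}
    """
    tag = phi[0]
    if tag == "T":
        return set(states)
    if tag == "and":
        return sat_set(k, states, phi[1]) & sat_set(k, states, phi[2])
    if tag == "dia":
        _, a, q, body = phi
        target = sat_set(k, states, body)   # largest possible witness set A
        trans = k.get(a, {})
        return {
            s for s in states
            if sum(p for (x, t), p in trans.items() if x == s and t in target) > q
        }
    raise ValueError("unknown formula constructor: %r" % (tag,))


# Assumed encoding of the two processes of Example 2.3 (the figure is not reproduced here).
left_states = {"s0", "s1", "s2", "s3"}
left_k = {"a": {("s0", "s1"): 0.3, ("s0", "s2"): 0.4}, "b": {("s2", "s3"): 1.0}}
right_states = {"t0", "t1", "t2"}
right_k = {"a": {("t0", "t1"): 0.7}, "b": {("t1", "t2"): 1.0}}

# <a>_{r'} <b>_{1/2} T with q < r' < r, here r' = 0.5 and r = p + q = 0.7.
phi = ("dia", "a", 0.5, ("dia", "b", 0.5, ("T",)))

if __name__ == "__main__":
    print("s0" in sat_set(left_k, left_states, phi))
    print("t0" in sat_set(right_k, right_states, phi))
```

Under these assumed probabilities the formula separates the two start states without using negation.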

The following example helps to illustrate some of the key aspects of the logic. 
Figure 1: Two processes which cannot be distinguished without negation in HML.

Example 2.3 (Example from [DEP98]) Consider the processes shown in figure 1. They are both nonprobabilistic
processes. It is well known that they cannot be distinguished by a negation-free formula of Hennessy-Milner
logic; the process on the left satisfies $\langle a\rangle \neg \langle b\rangle T$ while the process on the right does not. However, for
no assignment of probabilities are the two processes going to be bisimilar. Suppose that the two a-labeled
branches of the left hand process are given probabilities $p$ and $q$, and assume that the b-labeled transitions have
probability 1. Now if the right hand process has its a-labeled transition given a probability anything other
than $p + q$, say $r > p + q$, we can immediately distinguish the two processes by the formula $\langle a\rangle_{p+q} T$, which
will not be satisfied by the left hand process. If $r = p + q$ then we can use the formula $\langle a\rangle_{r'} \langle b\rangle_{1/2} T$ where
$q < r' < r$. The left hand process cannot satisfy this formula but the right hand one does, unless $p = 0$, in
which case the processes are bisimilar.


3 An alternate characterization of probabilistic bisimulation 

In this section, following Kozen [Koz85], we present an alternate characterization of probabilistic bisimulation
using functions into the reals instead of the logic $\mathcal{L}$. We first show that for countably infinite plMcs, we can
work with their finite sub-plMcs. Then we define a set of functions which are sufficient to characterize 
bisimulation. It is worth clarifying our terminology here. We define a set of functional expressions by giving 
an explicit syntax. A functional expression becomes a function when we interpret it in a system. Thus we 
may loosely say “the same function” when we move from one system to another. What we really mean is 
the “same functional expression”; obviously it cannot be the same function when the domains are different. 
This is no different from having syntactically defined formulas of some logic which become boolean-valued
functions when they are interpreted on a structure. 

Logical satisfaction is finitely detectable 

Definition 3.1 $P$ is a sub-plMc of $Q$ if $S_P \subseteq S_Q$ and $(\forall l)\ [k^P_l(s, t) \le k^Q_l(s, t)]$.

Thus, a sub-plMc of a plMc has fewer states and lower probabilities. The logic $\mathcal{L}$, since it does not have
negation, satisfies a basic monotonicity property with respect to substructures.

Lemma 3.2 If $P$ is a sub-plMc of $Q$, then $(\forall s \in S_P)\ [s \models_P \phi \Rightarrow s \models_Q \phi]$.

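Proof. The proof proceeds by induction on $\phi$. It is immediate for T and conjunction. Let $s \models_P \langle a\rangle_q.\psi$.
Then, we deduce:

$s \models_P \langle a\rangle_q.\psi \ \Rightarrow\ q < k^P_a(s, [\![\psi]\!]_P)$
$\Rightarrow\ q < k^Q_a(s, [\![\psi]\!]_P)$    ($P$ is a sub-plMc of $Q$)
$\Rightarrow\ q < k^Q_a(s, [\![\psi]\!]_Q)$    (by induction on $\psi$)
$\Rightarrow\ s \models_Q \langle a\rangle_q.\psi$. ∎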
Figure 2: Examples of plMcs


Every formula satisfied in a state of a plMc is witnessed by a finite sub-plMc. 

Lemma 3.3 Let $P$ be a plMc and $s \in S_P$ such that $s \models_P \phi$. Then there exists a finite sub-plMc of $P$, $Q^s_\phi$,
such that $s \in S_{Q^s_\phi}$ and $s \models_{Q^s_\phi} \phi$.

Proof. The proof is by induction on $\phi$. For T, the one state plMc containing $s$ suffices. For $\phi_1 \wedge \phi_2$, we
take the union of the finite plMcs $Q^s_{\phi_1}$, $Q^s_{\phi_2}$ given by the induction hypothesis. Note that lemma 3.2 ensures
that the plMc so constructed satisfies $s \models \phi_1$ and $s \models \phi_2$.

Let $s \models_P \langle a\rangle_q.\psi$. Then, since $q < k^P_a(s, [\![\psi]\!]_P)$, there is a finite subset $U = \{s_1, \ldots, s_n\} \subseteq [\![\psi]\!]_P$
such that $q < k^P_a(s, U)$. The required finite plMc $Q^s_{\langle a\rangle_q.\psi}$ is now constructed by taking the union of the
finite plMcs $Q^{s_1}_\psi, \ldots, Q^{s_n}_\psi$, adding state $s$ and the transitions from $s$ to $s_i$ for $i = 1 \ldots n$. ∎

We now give the class of functional expressions. First, some notation. Let $\lfloor r\rfloor_q = r - q$ if $r > q$, and 0
otherwise. $\lceil r\rceil^q = q$ if $r > q$, and $r$ otherwise. Note that $\lfloor r\rfloor_q + \lceil r\rceil^q = r$.

For each $c \in (0, 1]$, we consider a family $\mathcal{F}^c$ of functional expressions generated by the following grammar.
Here $q$ is a rational in [0, 1].

$f^c ::= \lambda s.1$                                                      Constant schema
   $\mid\ \lambda s.\min(f^c_1(s), f^c_2(s))$                              Min schema
   $\mid\ \lambda s.\, c \times \sum_{t \in S} k_a(s, t)\, f^c(t)$         Prefix schema
   $\mid\ \lambda s.\lfloor f^c(s)\rfloor_q \ \mid\ \lambda s.\lceil f^c(s)\rceil^q$    Conditional schema.

The functional expressions generated by these schemas will be written as $1$, $\min(f_1, f_2)$, $\langle a\rangle.f$, $\lfloor f\rfloor_q$ and
$\lceil f\rceil^q$ respectively. One can informally associate functional expressions with every connective of the logic $\mathcal{L}$ in
the following way; the precise formalization will be presented in lemma 3.7. T is represented by $\lambda s.1$ and
conjunction by min. The content of the connective $\langle a\rangle_q$ is split up into two expression schemas: the $\langle a\rangle.f$
schema that intuitively corresponds to prefixing and the conditional schema $\lfloor f\rfloor_q$ that captures the “greater
than $q$” idea.

Given a plMc $P$, any expression $f^c \in \mathcal{F}^c$ induces a function $f^c_P : S_P \to [0, 1]$.

Example 3.4 Consider the plMcs $A_1$ and $A_2$ of figure 2. All transitions are labeled $a$. The functional
expression $(\langle a\rangle.1)^c$ evaluates to $c$ at states $s_0$, $s_2$ of both $A_1$ and $A_2$; it evaluates to 0 at states $s_1$, $s_3$ of $A_1$
and $s_3$, $s_4$ of $A_2$, and it evaluates to $c/2$ at state $s_1$ of $A_2$. The functional expression $(\langle a\rangle.\langle a\rangle.1)^c$ evaluates to
$3c^2/4$ at states $s_0$ of $A_1$, $A_2$ and to 0 elsewhere. The functional expression $(\langle a\rangle.\lfloor\langle a\rangle.1\rfloor_{c/2})^c$ evaluates to $3c^2/8$
at state $s_0$ of $A_1$ and to $c^2/4$ at state $s_0$ of $A_2$.
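
The grammar of functional expressions can be turned into a small evaluator. The sketch below is illustrative: the tuple encoding of expressions is an assumption, and because figure 2 is not reproduced here, the transition probabilities of `A1` and `A2` are a reconstruction chosen to agree with the values quoted in Example 3.4, not data taken from the figure. The conditional threshold $c/2$ is likewise the value that reproduces the quoted results.

```python
def evaluate(expr, k, s, c):
    """Evaluate a functional expression at state s of a plMc with discount c.

    Expressions (assumed encoding):
      ("one",)           lambda s. 1                          (Constant schema)
      ("min", f, g)      lambda s. min(f(s), g(s))            (Min schema)
      ("pre", a, f)      lambda s. c * sum_t k_a(s,t) f(t)    (Prefix schema)
      ("floor", q, f)    f(s) - q if f(s) > q, else 0         (Conditional schema)
      ("ceil", q, f)     q if f(s) > q, else f(s)             (Conditional schema)
    k : dict mapping each label to a dict {(s, t): probability}
    """
    tag = expr[0]
    if tag == "one":
        return 1.0
    if tag == "min":
        return min(evaluate(expr[1], k, s, c), evaluate(expr[2], k, s, c))
    if tag == "pre":
        _, a, f = expr
        return c * sum(p * evaluate(f, k, t, c)
                       for (x, t), p in k.get(a, {}).items() if x == s)
    if tag == "floor":
        _, q, f = expr
        r = evaluate(f, k, s, c)
        return r - q if r > q else 0.0
    if tag == "ceil":
        _, q, f = expr
        r = evaluate(f, k, s, c)
        return q if r > q else r
    raise ValueError("unknown expression constructor: %r" % (tag,))


# Reconstructed transition probabilities (an assumption consistent with Example 3.4).
A1 = {"a": {("s0", "s1"): 0.25, ("s0", "s2"): 0.75, ("s2", "s3"): 1.0}}
A2 = {"a": {("s0", "s1"): 0.5, ("s0", "s2"): 0.5, ("s1", "s3"): 0.5, ("s2", "s4"): 1.0}}

if __name__ == "__main__":
    c = 0.5
    one_step = ("pre", "a", ("one",))
    two_steps = ("pre", "a", one_step)
    conditional = ("pre", "a", ("floor", c / 2, one_step))
    for name, chain in (("A1", A1), ("A2", A2)):
        print(name,
              evaluate(one_step, chain, "s0", c),
              evaluate(two_steps, chain, "s0", c),
              evaluate(conditional, chain, "s0", c))
```

On these reconstructed chains the two prefix-only expressions agree at $s_0$, while the conditional expression separates the two start states by $c^2/8$.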
Example 3.5 Consider the plMc $A_3$ of figure 2. All transitions are labeled $a$. A functional expression of the
form $(\langle a\rangle.\cdots.\langle a\rangle.1)^c$ with $n$ prefixes evaluates to $c^n$ at state $s_0$. On state $s_0$ of plMc $A_4$ the same functional expression
evaluates to $(c \times 0.4)^n$.

The following lemma is the functional analog of lemma 3.2. 

Lemma 3.6 If $P$ is a sub-plMc of $Q$, then $(\forall s \in S_P)\ [f^c_P(s) \le f^c_Q(s)]$.

The proof is a routine induction on the construction of the functional expression f c and is omitted. 

Lemma 3.7 Given any $\phi \in \mathcal{L}$ and a finite plMc $P$, and any $c \in (0, 1]$, there is a functional expression
$f^c \in \mathcal{F}^c$ such that

1. $\forall s \in S_P.\ f^c_P(s) > 0$ iff $s \models_P \phi$.

2. for any plMc $Q$, $\forall s \in S_Q.\ s \not\models_Q \phi \Rightarrow f^c_Q(s) = 0$.

Proof. The proof is by induction on the structure of $\phi$. If $\phi = T$, the functional expression $\lambda s.1$ suffices. If
$\phi = \psi_1 \wedge \psi_2$, let $f^c_1$ and $f^c_2$ be the functional expressions corresponding to $\psi_1$ and $\psi_2$. Then the functional
expression $\lambda s.\min(f^c_1(s), f^c_2(s))$ satisfies the conditions.

If $\phi = \langle a\rangle_q.\psi$, let $g^c$ be the functional expression corresponding to $\psi$ yielded by induction. Let $S_\psi$ be the
set of states in $P$ satisfying $\psi$, and let $x = \min\{g^c(s) \mid s \in S_\psi\}$. By induction hypothesis, $x > 0$. Consider
the functional expression $f^c$ given by $\lfloor \langle a\rangle.\lceil g\rceil^x \rfloor_{cxq}$. For all $t \in [\![\psi]\!]$, $\lceil g\rceil^x(t) = x$. Now for any state
$s \in S_P$,

$(\langle a\rangle.\lceil g\rceil^x)^c(s) = c \times \sum_{t \in [\![\psi]\!]} k_a(s, t)\, x = c\, x\, k_a(s, [\![\psi]\!]).$

Now for each state $s \in [\![\phi]\!]$, $k_a(s, [\![\psi]\!]) > q$. Thus $f^c$ satisfies the first condition.

The second condition holds because for any state $s$ in $Q$, $(\langle a\rangle.\lceil g\rceil^x)^c(s) \le c\, x\, k_a(s, [\![\psi]\!])$, so if $k_a(s, [\![\psi]\!]) \le q$
then $(\lfloor \langle a\rangle.\lceil g\rceil^x \rfloor_{cxq})(s) = 0$. ∎

Corollary 3.8 For any plMc $P$ and state $s \in S_P$, if $s \models_P \phi$ then there exists $f^c \in \mathcal{F}^c$ such that $f^c_P(s) > 0$
and $(\forall \text{plMc } R)\ (\forall s \in S_R)\ f^c_R(s) > 0 \Rightarrow s \models_R \phi$.

Proof. Let $s$ be a state in plMc $P$ such that $s \models_P \phi$. By lemma 3.3, there is a finite sub-plMc $Q$ of $P$
such that $s \models_Q \phi$. By lemma 3.7, $\exists f^c \in \mathcal{F}^c$ such that $f^c_Q(s) > 0$ and for any plMc $R$, $\forall s \in S_R.\ s \not\models_R \phi \Rightarrow
f^c_R(s) = 0$. By lemma 3.6, $f^c_P(s) > 0$, so $f^c$ satisfies the conditions required by the corollary. ∎

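Theorem 3.9 For any plMc $P$, $(\forall c \in (0, 1])$, $\forall s, s' \in S_P$

$[(\forall \phi \in \mathcal{L})\ s \models_P \phi \Leftrightarrow s' \models_P \phi] \ \Leftrightarrow\ (\forall f^c \in \mathcal{F}^c)\ [f^c_P(s) = f^c_P(s')].$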

Proof. Let $(\forall \phi \in \mathcal{L})\ s \models_P \phi \Leftrightarrow s' \models_P \phi$. Then, by the results of [DEP98], there is a bisimulation $R$
such that $s$, $s'$ are in the same equivalence class. We now show that for any bisimulation $R$, $s R s'$ implies that
$(\forall f^c \in \mathcal{F}^c)\ [f^c_P(s) = f^c_P(s')]$. The proof proceeds by induction on the structure of the functional expression $f^c$.

The key case is when $f^c$ is of the form $(\langle a\rangle.g)^c$. Let $E_i$ be the $R$-equivalence classes. Then,

$f^c_P(s) = c \times \sum_{t \in S} k_a(s, t)\, g^c(t)$
$= c \times \sum_i \sum_{t \in E_i} k_a(s, t)\, g^c(t)$
$= c \times \sum_i [g^c(E_i) \times k_a(s, E_i)]$    (by induction, $g^c$ is constant on $E_i$)
$= c \times \sum_i [g^c(E_i) \times k_a(s', E_i)]$    (from $s R s'$)
$= c \times \sum_i \sum_{t \in E_i} k_a(s', t)\, g^c(t) = f^c_P(s').$

For the converse, let $\phi$ be such that $s \models_P \phi$ and $s' \not\models_P \phi$. By corollary 3.8, there is a functional expression
$f^c$ such that $f^c_P(s) > 0$ and $f^c_P(s') = 0$. ∎
Example 3.10 Consider the plMcs $A_1$, $A_2$ of figure 2. The calculations of example 3.4 show that the $s_0$
states of $A_1$, $A_2$ are distinguishable. Furthermore, the states are indistinguishable if we use only the function
schemas Constant, Min and Prefix. Thus, example 3.4 shows that the conditional functional expressions
are necessary.


4 A Metric on Processes 

Each collection $\mathcal{F}^c$ of functional expressions induces a distance function as
follows:

$d^c(P, Q) = \sup_{f^c \in \mathcal{F}^c} |f^c_P(s_P) - f^c_Q(s_Q)|.$


Theorem 4.1 For all $c \in (0, 1]$, $d^c$ is a metric.

Proof. The triangle inequality and symmetry of $d^c$ are immediate. That $d^c(P, Q) = 0$ iff $P$ and $Q$ are bisimilar follows
from theorem 3.9. ∎

Example 4.2 The analysis of example 3.10 yields $d^c(A_1, A_2) = c^2/8$.

Example 4.3 Example 3.5 shows the fundamental difference between the metrics $d^c$, $c < 1$, and $d^1$. For
$c < 1$, $d^c(A_3, A_4)$ is witnessed by $(\langle a\rangle.1)^c$ and is given by $d^c(A_3, A_4) = 0.6c$. In contrast, $d^1(A_3, A_4) =
\sup\{1 - (0.4)^n \mid n = 0, 1, \ldots\} = 1$. What this shows is that the notion of convergence is different for the two
metrics. If we had a family of processes like $A_4$ with the transition probability given by $1 - \lambda$, the distance of
these processes from $A_3$ would always be 1, hence they would not converge to $A_3$ in the $d^1$ metric, but they
would converge to $A_3$ in any $d^c$ metric with $c < 1$. Thus the $d^1$ metric defines a different topology than do the
other metrics.
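
The different convergence behavior can be observed numerically on the nested prefix expressions of Example 3.5. The sketch below assumes, following that example, that an expression with $n$ prefixes evaluates to $c^n$ on $A_3$ and to $(c \times p)^n$ on a chain like $A_4$ whose transition probability is $p$. It only inspects these prefix expressions up to a finite depth, so it gives a lower bound on the distance rather than the distance itself; the function names are hypothetical.

```python
def prefix_gap(c, p, n):
    """Gap between the two chains on the expression with n nested a-prefixes.

    Assumption (from Example 3.5): the value is c**n on A_3 and (c*p)**n on the A_4-like chain.
    """
    return c ** n - (c * p) ** n


def best_prefix_gap(c, p, max_depth=200):
    """Largest gap over prefix expressions of depth 0..max_depth (a lower bound on the distance)."""
    return max(prefix_gap(c, p, n) for n in range(max_depth + 1))


if __name__ == "__main__":
    for p in (0.9, 0.99, 0.999):
        # With c = 1 the gap 1 - p**n approaches 1 once n is large enough;
        # with c = 0.5 the discount keeps every gap small when p is close to 1.
        print(p, best_prefix_gap(1.0, p, max_depth=20000), best_prefix_gap(0.5, p))
```

As $p$ approaches 1 the undiscounted gap stays near 1 for deep enough expressions, while the discounted gap shrinks, matching the difference in topology described in the example.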

Example 4.4 (Analysis of Example 1.4) Consider the family of plMcs $\{P_\epsilon \mid 0 \le \epsilon < r\}$ where $P_\epsilon = a_{r-\epsilon}.Q$,
i.e. $P_\epsilon$ is the plMc that does an $a$ with probability $r - \epsilon$ and then behaves like $Q$. The functional expression
$(\langle a\rangle.1)^c$ evaluates to $(r - \epsilon)c$ at $P_\epsilon$. This functional expression witnesses the distance between any two $P$'s
(other functions will give smaller distances). Thus, we get $d^c(P_{\epsilon_1}, P_{\epsilon_2}) = c|\epsilon_1 - \epsilon_2|$. This furthermore ensures
that $P_\epsilon$ converges to $P_0$ as $\epsilon$ tends to 0.

Example 4.5 (from [DEP98]) Consider the plMcs $P$ (left) and $Q$ (right) of figure 3. $Q$ is just like $P$ except
that there is an additional transition to a state which then has an a-labeled transition back to itself. The
probability numbers are as shown. If both plMcs have the same values on all functional expressions we will
show that $q_\infty = 0$, i.e. it really cannot be present. The functional expression $(\langle a\rangle.1)^c$ yields $c(\sum_{i \ge 0} p_i)$
on $P$ and $c(q_\infty + \sum_{i \ge 0} q_i)$ on $Q$. The functional expression $(\langle a\rangle.\langle a\rangle.1)^c$ yields $c^2(\sum_{i \ge 1} p_i)$ on $P$ and
$c^2(q_\infty + \sum_{i \ge 1} q_i)$ on $Q$. Thus, we deduce that $p_0 = q_0$. Similarly, considering functional expressions
$(\langle a\rangle.\langle a\rangle.\langle a\rangle.1)^c$ etc., we deduce that $p_n = q_n$. Thus, $q_\infty = 0$.


A decision procedure for $d^c$, $c < 1$. Given finite plMcs $P$, $Q$, we now provide a decision procedure for
computing $d^c(P, Q)$ for $c < 1$ to any desired accuracy $c^n$, where $n$ is a natural number. We do this by
computing $\sup_F |f^c(s_P) - f^c(s_Q)|$ for a finite set of functions $F$, and then show that for this $F$,
$d^c(P, Q) - \sup_F |f^c(s_P) - f^c(s_Q)| \le c^n$.

Figure 3: Probability and countable branching

Define the depth of a functional expression inductively as follows: $\mathrm{depth}(\lambda s.1) = 0$,
$\mathrm{depth}(\min(f^c_1, f^c_2)) = \max(\mathrm{depth}(f^c_1), \mathrm{depth}(f^c_2))$, $\mathrm{depth}(\lfloor f^c\rfloor_q) = \mathrm{depth}(\lceil f^c\rceil^q) = \mathrm{depth}(f^c)$, and
$\mathrm{depth}(\langle a\rangle.f^c) = \mathrm{depth}(f^c) + 1$. Then it is clear that $|f^c(s_P) - f^c(s_Q)| \le c^{\mathrm{depth}(f^c)}$. Now if we include in $F$ all
functions of depth $\le n$, then $d^c(P, Q) - \sup_F |f^c(s_P) - f^c(s_Q)| \le c^n$.
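
The depth of an expression, and the depth needed to reach a given accuracy when $c < 1$, can be computed as follows. This is a sketch under an assumed tuple encoding of functional expressions; `depth_for_accuracy` is a hypothetical helper that finds the smallest $n$ with $c^n \le \epsilon$ and is not part of the procedure described in the paper.

```python
def depth(expr):
    """Depth of a functional expression: the maximal nesting of prefix steps.

    Assumed encoding: ("one",), ("min", f, g), ("pre", a, f), ("floor", q, f), ("ceil", q, f).
    """
    tag = expr[0]
    if tag == "one":
        return 0
    if tag == "min":
        return max(depth(expr[1]), depth(expr[2]))
    if tag in ("floor", "ceil"):
        return depth(expr[2])
    if tag == "pre":
        return depth(expr[2]) + 1
    raise ValueError("unknown expression constructor: %r" % (tag,))


def depth_for_accuracy(c, eps):
    """Smallest n such that c**n <= eps, for 0 < c < 1 and eps > 0."""
    if not (0.0 < c < 1.0) or eps <= 0.0:
        raise ValueError("requires 0 < c < 1 and eps > 0")
    n, bound = 0, 1.0
    while bound > eps:
        bound *= c
        n += 1
    return n


if __name__ == "__main__":
    expr = ("pre", "a", ("min", ("one",), ("floor", 0.25, ("pre", "a", ("one",)))))
    print(depth(expr), depth_for_accuracy(0.5, 0.01))
```

Since an expression of depth $k$ takes values in $[0, c^k]$, expressions deeper than the returned $n$ cannot separate two start states by more than the requested accuracy.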

However there are infinitely many functional expressions of depth $\le n$. We now construct a finite subset of these, such that the above inequality still holds. Let $A^i = \{ k / 3^{m+1+n-i} \mid k = 0, \ldots, 3^{m+1+n-i} \}$, where $1/3^m < c^n$. We construct the set of functions inductively as follows. Let $F^i$ be the set of all functions of depth $\le i$. Define:

$F_1^{i+1} = \{ \langle a \rangle.f \mid f \in F^i \}$

$F_2^{i+1} = \{ \lfloor f \rfloor_q \mid f \in F_1^{i+1}, q \in A^{i+1} \}$

$F_3^{i+1} = F_2^{i+1} \cup \{ \lceil f \rceil^q \mid f \in F_2^{i+1}, q \in A^{i+1} \}$

$F^{i+1}$ is defined by closing $F_3^{i+1}$ under pairwise mins.
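
As a hedged illustration of the threshold grid used in this construction, the sketch below reads the set A^i as the multiples of 1/3^(m+1+n-i) in [0,1], with m chosen so that 1/3^m < c^n. That reading, the function names `choose_m`, `grid` and `nearest`, and the sample values of c, n and q are all assumptions made for the example.

```python
from fractions import Fraction

def choose_m(c, n):
    """Smallest m with 1/3^m < c^n (c is assumed to satisfy 0 < c < 1)."""
    m = 0
    while Fraction(1, 3 ** m) >= c ** n:
        m += 1
    return m

def grid(m, n, i):
    """Thresholds k / 3^(m+1+n-i) for k = 0, ..., 3^(m+1+n-i)."""
    denom = 3 ** (m + 1 + n - i)
    return [Fraction(k, denom) for k in range(denom + 1)]

def nearest(q, points):
    """Grid point closest to q; the error is at most half the grid spacing."""
    return min(points, key=lambda p: abs(p - q))

c, n = Fraction(9, 10), 2
m = choose_m(c, n)
A2 = grid(m, n, 2)
q = Fraction(37, 100)
approx = nearest(q, A2)
assert abs(approx - q) <= Fraction(1, 2 * (len(A2) - 1))
```

Replacing an arbitrary rational threshold by its nearest grid point is what makes the family of candidate functions finite, at the cost of an approximation error that is controlled by the grid spacing.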

We can prove that for any $f^c \in \mathcal{F}^c$ of depth $\le n$, there is a function in $F^n$ that approximates it closely enough.

Lemma 4.6 Let $f^c \in \mathcal{F}^c$ be of depth $i \le n$. Then, there exists $g^c_f \in F^i$ such that:

$(\forall \text{plMc } P)\ (\forall s \in S_P)\ \left[\, |f^c(s) - g^c_f(s)| \le \frac{1}{3^{m+n-i}} \,\right]$.

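Proof. The proof proceeds by induction on $i$. In this extended abstract, we only sketch the two basic ideas of the proof for the inductive step.

(1) The following identities show that repeating steps 2, 3, 4 on $F^{i+1}$ does not get any new functions.

$\lfloor \lfloor f \rfloor_q \rfloor_r = \lfloor f \rfloor_{q+r}$, $\quad \lceil \lceil f \rceil^q \rceil^r = \lceil f \rceil^{\min(q,r)}$

$\lfloor \lceil f \rceil^q \rfloor_r = \lceil \lfloor f \rfloor_r \rceil^{q-r}$

$\lfloor \min(f_1, f_2) \rfloor_r = \min(\lfloor f_1 \rfloor_r, \lfloor f_2 \rfloor_r)$, $\quad \lceil \min(f_1, f_2) \rceil^r = \min(\lceil f_1 \rceil^r, \lceil f_2 \rceil^r)$.

(2) Define $f_1, f_2$ to be $\epsilon$-close if for all states $s \in S_P \cup S_Q$, $|f_1(s) - f_2(s)| < \epsilon$. Then if $f_1$ and $f_2$ are $\epsilon$-close, then $\langle a \rangle.f_1$ and $\langle a \rangle.f_2$ are $\epsilon$-close, and so are $\lfloor f_1 \rfloor_q$ and $\lfloor f_2 \rfloor_q$, and also $\lceil f_1 \rceil^q$ and $\lceil f_2 \rceil^q$. In addition, if $f_1'$ and $f_2'$ are also $\epsilon$-close, then $\min(f_1, f_1')$ and $\min(f_2, f_2')$ are also $\epsilon$-close. Furthermore,

$|q_1 - q_2| < \epsilon \Rightarrow \sup_x \{ |\lfloor f \rfloor_{q_1}(x) - \lfloor f \rfloor_{q_2}(x)| \} < \epsilon$.

Similarly for $\lceil f \rceil^q$.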
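
The identities in step (1) and the threshold estimate in step (2) can be checked pointwise on a small grid. This is a sketch under an explicit assumption: the two truncation operators are taken to act on values as max(x - q, 0) and min(x, q). Those definitions are not restated in this section, so they are an assumption of the example, as are the function names `floor_q` and `ceil_q`; the third identity is checked only for q >= r.

```python
from fractions import Fraction
from itertools import product

def floor_q(x, q):
    """Assumed pointwise action of the lower truncation: max(x - q, 0)."""
    return max(x - q, 0)

def ceil_q(x, q):
    """Assumed pointwise action of the upper truncation: min(x, q)."""
    return min(x, q)

grid = [Fraction(k, 8) for k in range(9)]   # sample values in [0, 1]
for x, y, q, r in product(grid, repeat=4):
    assert floor_q(floor_q(x, q), r) == floor_q(x, q + r)
    assert ceil_q(ceil_q(x, q), r) == ceil_q(x, min(q, r))
    if q >= r:
        assert floor_q(ceil_q(x, q), r) == ceil_q(floor_q(x, r), q - r)
    assert floor_q(min(x, y), r) == min(floor_q(x, r), floor_q(y, r))
    assert ceil_q(min(x, y), r) == min(ceil_q(x, r), ceil_q(y, r))
    # Moving the threshold by |q - r| moves the value by at most |q - r|.
    assert abs(floor_q(x, q) - floor_q(x, r)) <= abs(q - r)
    assert abs(ceil_q(x, q) - ceil_q(x, r)) <= abs(q - r)
identities_hold = True
```

A grid check of this kind is not a proof, but it is a quick way to catch a mis-transcribed subscript or superscript in the identities.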


5 Examples of metric reasoning principles 

In this section, we use a process algebra and an example coded in the process algebra to illustrate the type of
reasoning provided by our study.

5.1 A process algebra

The process algebra describes probabilistically determinate processes. The processes are input-enabled [LT89, Dil88, Jos92] in a weak sense ($(\forall s \in S_P)\ (\forall a? \in L_?)\ k_{a?}(s, S_P) > 0$) and communication is via CSP style broadcast. The process combinators that we consider are parallel composition, prefixing and probabilistic choice. We do not consider hiding since this paper focuses on strong probabilistic bisimulation. Though we do not enforce the fact that output actions do not block, this assumption can safely be added to the algebra to make it an IO calculus [Vaa91]; this change does not alter the results of this section.

We assume an underlying set of labels $A$. Let $L_? = \{a? \mid a \in A\}$ be the set of input labels, and $L_! = \{a! \mid a \in A\}$ the set of output labels. The set of labels is given by $L = L_? \cup L_!$. Every process $P$ is associated with a subset of labels: $P_o \subseteq L_!$, the set of relevant output labels. This signature is used to constrain parallel composition.

Prefixing. $P = a?_r.Q$, where $r$ is a rational number, is the process that accepts input $a$ and then performs as $Q$. The number $r$ is the probability of accepting $a?$. With probability $(1 - r)$ the process $P = a?_r.Q$ will block on an $a?$ label. $S_P$ is given by adding a new state, $q$, to $S_Q$. Add a transition labeled $a?$ from $q$ to the start state of $Q$ with probability $r$. For all other labels $l$, add an $l?$-labeled self-loop at $q$ with probability 1. $q$ is the start state of $P$.
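
The construction of the input prefix can be sketched on a finite representation. The dictionary encoding used here (keys `states`, `start`, `trans`, `out`, with `trans` mapping a (state, label) pair to a dictionary of successor probabilities) and the function name `input_prefix` are hypothetical choices made for this example.

```python
from fractions import Fraction

def input_prefix(a, r, Q, alphabet):
    """Build a?_r.Q from Q (hypothetical dict encoding of a finite process)."""
    q = ("prefix", a, Q["start"])            # fresh start state
    trans = dict(Q["trans"])
    trans[(q, a + "?")] = {Q["start"]: r}    # accept a? with probability r
    for l in alphabet:
        if l != a:
            trans[(q, l + "?")] = {q: 1}     # every other input loops at q
    return {"states": Q["states"] | {q}, "start": q,
            "trans": trans, "out": set(Q["out"])}

# A one-state process that loops on b? with probability 1.
Q = {"states": {"s"}, "start": "s",
     "trans": {("s", "b?"): {"s": 1}}, "out": set()}
P = input_prefix("a", Fraction(1, 3), Q, alphabet={"a", "b"})
assert P["trans"][(P["start"], "a?")] == {"s": Fraction(1, 3)}
```

The missing mass 1 - r on the a? edge is what models blocking: the representation stores sub-probability distributions rather than padding them to total mass 1.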

Output prefixing. $P = a!_r.Q$, where $r$ is a rational number, is the process that performs output action $a!$ and then functions as $Q$; it is defined analogously. In this case, $P_o = Q_o \cup \{a!\}$.

Probabilistic choice. $P = Q +_r Q'$ is the probabilistic choice combinator [JP89] that chooses between $Q$ and $Q'$: $Q$ is chosen with probability $r$ and $Q'$ is chosen with probability $1 - r$.

$P_o = Q_o \cup Q'_o$. $S_P = S_Q \uplus S_{Q'}$. Now $k_l^P(q, A \uplus A') = k_l^Q(q, A)$ if $q \in S_Q$, and $k_l^P(q, A \uplus A') = k_l^{Q'}(q, A')$ if $q \in S_{Q'}$. In this case, we define an initial distribution $\mu$: $\mu(\{s_Q\}) = r$, $\mu(\{s_{Q'}\}) = 1 - r$, referring the reader to section 2 for a way to convert the initial distribution to an initial state.


Parallel composition. $P = Q \parallel Q'$ is permitted if the output actions of $Q, Q'$ are disjoint, i.e. $Q_o \cap Q'_o = \emptyset$. The parallel composition synchronizes on all labels in $Q_L \cap Q'_L$.

$P_o = Q_o \uplus Q'_o$. $S_P = S_Q \times S_{Q'}$. The $k_l^P$ definition is motivated by the following idea. Let $s$ (resp. $s'$) be a state of $Q$ (resp. $Q'$). We expect the following synchronized transitions from the product state $(s, s')$:

$\dfrac{s \xrightarrow{a!} t \quad s' \xrightarrow{a?} t'}{(s, s') \xrightarrow{a!} (t, t')} \qquad \dfrac{s \xrightarrow{a?} t \quad s' \xrightarrow{a!} t'}{(s, s') \xrightarrow{a!} (t, t')} \qquad \dfrac{s \xrightarrow{a?} t \quad s' \xrightarrow{a?} t'}{(s, s') \xrightarrow{a?} (t, t')}$

The disjointness of the output labels of $Q, Q'$ ensures that there is no non-determinism. Formally, if $l = a! \in Q_o$, then $k_l^P((s, s'), (t, t')) = k_{a!}^Q(s, t) \times k_{a?}^{Q'}(s', t')$. The case when $a! \in Q'_o$ and $l = a?$ is similar.
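
A finite-state sketch of the product construction may help fix ideas. It rests on several assumptions that go beyond the text: processes are encoded as dictionaries (keys `states`, `start`, `trans`, `out`), an output of one component is paired with the matching input of the other, two inputs are paired with each other, and the product probability is the product of the component probabilities. The names `parallel`, `producer` and `buffer1` are hypothetical.

```python
from fractions import Fraction

def parallel(Q1, Q2, alphabet):
    """Sketch of Q1 || Q2 on the product state space (hypothetical encoding)."""
    if Q1["out"] & Q2["out"]:
        raise ValueError("output labels of the two processes must be disjoint")
    trans = {}
    for a in alphabet:
        # Each rule is (label of the product, label used by Q1, label used by Q2).
        rules = [(a + "?", a + "?", a + "?")]
        if a + "!" in Q1["out"]:
            rules.append((a + "!", a + "!", a + "?"))
        if a + "!" in Q2["out"]:
            rules.append((a + "!", a + "?", a + "!"))
        for label, l1, l2 in rules:
            for s1 in Q1["states"]:
                for s2 in Q2["states"]:
                    d1 = Q1["trans"].get((s1, l1), {})
                    d2 = Q2["trans"].get((s2, l2), {})
                    prod = {(t1, t2): p1 * p2
                            for t1, p1 in d1.items() for t2, p2 in d2.items()}
                    if prod:
                        trans[((s1, s2), label)] = prod
    states = {(s1, s2) for s1 in Q1["states"] for s2 in Q2["states"]}
    return {"states": states, "start": (Q1["start"], Q2["start"]),
            "trans": trans, "out": Q1["out"] | Q2["out"]}

producer = {"states": {"x"}, "start": "x", "out": {"put!"},
            "trans": {("x", "put!"): {"x": Fraction(1, 2)}}}
buffer1 = {"states": {0, 1}, "start": 0, "out": set(),
           "trans": {(0, "put?"): {1: Fraction(2, 3)}}}
system = parallel(producer, buffer1, alphabet={"put"})
assert system["trans"][(("x", 0), "put!")] == {("x", 1): Fraction(1, 3)}
```

Because the output sets are disjoint, at most one of the two output rules applies to any label, so each (state, label) pair of the product receives a single distribution.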

We now show that each of the operations of the process algebra is a contraction mapping with respect to the metric defined above. Since theorem 3.9 shows that $d(P, Q) = 0$ iff $P \approx Q$, this shows that bisimulation is a congruence with respect to these operations.


Theorem 5.1 The following hold: 


1. $d^c(l_r.P, l_r.Q) \le c\, d^c(P, Q)$ for any label $l$.

2. $d^c(P +_r R, Q +_r R) \le d^c(P, Q)$ for any $R$.

3. $d^c(P \parallel R, Q \parallel R) \le d^c(P, Q)$ for any $R$ for which the processes on the left are defined.

Figure 4: The producer consumer example. Panels: (a) Producer (edge put, $p$); (b) Consumer (edge get, $q$); (c) Buffer, size 2 (edges get?, $1-r$ and put?, $\epsilon$); (d) Producer || Consumer || Buffer$_2$ (edges get, $q(1-r)$ and put, $p\epsilon$).


Proof. The proof proceeds by induction on functional expressions. Let $d^c_{f^c}(P, Q)$ be the distance using $f^c$, so that $d^c(P, Q) = \sup_{f^c} d^c_{f^c}(P, Q)$. We show that for any $f^c$, $d^c_{f^c}$ of the LHS is less than or equal to some $d^c_{g^c}$ of the RHS. In this extended abstract, we omit the detailed calculations. The key case is $f^c = \langle a \rangle.h^c$, sketched below for the parallel composition. If $a = b!$ and $b! \in R_o$, then by induction we know that $d^c_{h^c}(P' \parallel R', Q' \parallel R') \le d^c_{g^c}(P', Q')$, where $P', Q', R'$ are the same as $P, Q, R$ but with the start distribution obtained by making a $b!$ transition on the start state in the case of $R$, and a $b?$ transition in the case of $P, Q$. Now $d^c_{f^c}(P \parallel R, Q \parallel R) = c \times d^c_{h^c}(P' \parallel R', Q' \parallel R') \le c \times d^c_{g^c}(P', Q') = d^c_{\langle a \rangle.g^c}(P, Q)$. ■

Lemma 5.2 The following properties are true of our metric: 

1. $d^c(a_r.P, a_s.P) \le c\,|r - s|$.

2. $d^c(P +_r Q, P +_s Q) \le |r - s|\, d^c(P, Q)$.

3. $d^c(P +_r Q, P' +_r Q) \le r\, d^c(P, P')$.


5.2 A bounded buffer example 

We specify a producer consumer process with a bounded buffer (along the lines of [PS85]). The producer is specified by the one-state finite automaton shown in Figure 4(a): it outputs a put, corresponding to producing a packet, with probability $p$ (we omit the ! in the labels). To keep the figure uncluttered, we also omit the input-enabling arcs, all of which have probability 1. The consumer (Figure 4(b)) is analogous: it outputs a get with probability $q$, corresponding to consuming a packet. The buffer is an $n$-state automaton whose states are merely used to count the number of packets in the buffer, while the probabilities code up the probability of scheduling either the producer or the consumer (thus the producer gets scheduled with probability $r$, and then produces a packet with probability $p$). Upon receiving a put in the last state, the buffer accepts it with a very small probability $\epsilon$, modeling a blocked input. The parallel composition of the three processes is shown in Figure 4(d).

As the buffer size increases, the distance between the bounded buffer and the unbounded buffer decreases to 0. Let $P_k$ = Producer || Consumer || Buffer$_k$, where Buffer$_k$ denotes the process Buffer with $k$ states. Then by looking at the structure of the process, we can compute that $d(P_k, P_\infty) \propto (cpr)^k$. This allows us to conclude the following:

Figure 5: A producer with transient behavior


• As the bounded buffer becomes larger, it approximates an infinite buffer more closely: if $m > k$ then $d^c(P_k, P_\infty) > d^c(P_m, P_\infty)$.

• As the probability of a put decreases, the bounded buffer approximates an infinite buffer more closely. Thus if $p < p'$, $d^c(P_k^p, P_\infty^p) < d^c(P_k^{p'}, P_\infty^{p'})$, where the superscripts indicate the producer probability.

• Similarly, as the probability of scheduling the Producer process (r) decreases, the buffer approximates 
an infinite buffer more closely. 
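
The three monotonicity claims follow from the shape of the estimate, a quantity proportional to (cpr)^k. The sketch below makes this concrete; the constant of proportionality is not given in the text, so the `scale` parameter is a placeholder, and the function name `buffer_gap` and all numeric values are hypothetical.

```python
def buffer_gap(c, p, r, k, scale=1.0):
    """(c*p*r)^k times an unspecified constant (scale is a placeholder)."""
    return scale * (c * p * r) ** k

c, p, r = 0.9, 0.5, 0.5
gaps = [buffer_gap(c, p, r, k) for k in range(1, 6)]

# Larger buffers are closer to the unbounded buffer.
assert all(x > y for x, y in zip(gaps, gaps[1:]))
# A smaller put probability p gives a smaller gap for the same buffer size.
assert buffer_gap(c, 0.3, r, 3) < buffer_gap(c, 0.5, r, 3)
# A smaller scheduling probability r gives a smaller gap as well.
assert buffer_gap(c, p, 0.2, 3) < buffer_gap(c, p, 0.5, 3)
```

Since c, p and r all lie in [0, 1], the gap decays geometrically in the buffer size k whenever their product is strictly below 1.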


6 The asymptotic metric 

Let $P$ be a plMc. Then $P$ after $a$ is the same plMc but with start distribution given by $\nu(t) = k_a(s, t)$. We perform some normalization based on the total probability of the resulting initial configuration $\nu(S)$: if $\nu(S) > 0$, it is normalized to be 1; if $\nu(S) = 0$, it is left untouched.

This definition extends inductively to $P$ after $s$, where $s$ is a finite sequence of labels $(a_0, a_1, a_2, \ldots, a_k)$. Note that $P$ after $s$ is identical to $P$ except that its initial configuration may be different.
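
The "after" operation with its normalization step can be sketched for finite systems as follows. This is an illustration under assumptions: the process is encoded as a dictionary with a start distribution (`start`) and a transition table (`trans`), which generalizes the single start state of the text, and the names `after`, `after_seq` and `prefix` are hypothetical.

```python
from fractions import Fraction

def after(P, a):
    """P after a: push the start distribution through k_a, then normalize."""
    nu = {}
    for s, w in P["start"].items():
        for t, p in P["trans"].get((s, a), {}).items():
            nu[t] = nu.get(t, 0) + w * p
    total = sum(nu.values())
    if total > 0:                      # normalize to total mass 1
        nu = {t: p / total for t, p in nu.items()}
    return {"start": nu, "trans": P["trans"]}   # mass 0 is left untouched

def after_seq(P, labels):
    """Extend `after` to a finite sequence of labels."""
    for a in labels:
        P = after(P, a)
    return P

def prefix(r):
    """A small process: an a-step with probability r into a b-loop."""
    return {"start": {"q": Fraction(1)},
            "trans": {("q", "a"): {"s": r}, ("s", "b"): {"s": Fraction(1, 2)}}}

left = after(prefix(Fraction(1, 4)), "a")
right = after(prefix(Fraction(3, 4)), "a")
assert left["start"] == right["start"] == {"s": 1}
```

After normalization the two prefixed processes have the same initial configuration, so the different prefix probabilities no longer distinguish them once the first step has been taken.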

Define the $j$-distance between $P, Q$ as $d^c_j(P, Q) = \sup\{ d^c(P \text{ after } s, Q \text{ after } s) \mid \mathrm{length}(s) = j \}$. We define the asymptotic distance between processes $P$ and $Q$, $d^c_\infty(P, Q)$, to be

$d^c_\infty(P, Q) = \lim_{i \to \infty} \sup_{j > i} d^c_j(P, Q)$.

The fact that $d^c_\infty$ satisfies the triangle inequality and is symmetric immediately follows from the same properties for $d^c$.

Example 6.1 For any plMc $P$, $d^c_\infty(a_r.P, a_s.P) = 0$, where $r, s > 0$. Consider $A_3$ from Figure 2. Without the normalization in the definition of $A_3$ after $s$, we would have got $d^c_\infty(a_r.A_3, a_s.A_3) = c\,|r - s|$.

Example 6.2 Consider the producer process $P_2$ shown in Figure 5. This is similar to the producer $P_1$ in Figure 4, except that initially the probability of producing put is more than $p$; however, as more puts are produced, it asymptotically approaches $p$. If we consider the asymptotic distance between these two producers, we see that $d^c(P_2 \text{ after } put^n, P_1 \text{ after } put^n) \propto 2^{-(n+1)}$. Thus $d^c_\infty(P_1, P_2) = 0$. Now by using the compositionality of parallel composition (see below), we see that $d^c_\infty(P_1 \parallel \text{Consumer} \parallel \text{Buffer}_k,\ P_2 \parallel \text{Consumer} \parallel \text{Buffer}_k) = 0$, which is the intuitively expected result.

Parallel composition and prefixing in the process algebra are contraction mappings with respect to the 
metric defined above — this will show that asymptotic equivalence is preserved by these operations. 

Theorem 6.3 The following hold: 

1. $d^c_\infty(l_r.P, l_r.Q) \le d^c_\infty(P, Q)$ for any label $l$.

2. $d^c_\infty(P \parallel R, Q \parallel R) \le d^c_\infty(P, Q)$.

For the key case of parallel composition, the proof is based on: $(P \parallel Q) \text{ after } s = (P \text{ after } s_1) \parallel (Q \text{ after } s_2)$, where $s_1$ has those $a!$ labels of $s$ replaced by $a?$ where $a! \notin P_o$, and similarly for $s_2$.


Acknowledgements. We have benefited from discussions with Franck van Breugel about the Hutchinson metric.